Abstract: Loop Calculus, introduced in [1], [2], constitutes
a new theoretical tool that explicitly expresses the symbol
Maximum-A-Posteriori (MAP) solution of a general statistical
inference problem via a solution of the Belief Propagation (BP)
equations. This finding brought a new significance to the BP
concept, which in the past was thought of as just a loop-free
approximation. In this paper we continue the discussion of the Loop
Calculus. We introduce an invariant formulation that allows one to
generalize the Loop Calculus approach to a q-ary alphabet.

The manuscript is organized as follows. In Section I we
introduce a new formulation of the Loop Calculus in terms of
a set of gauge transformations that keep the partition function
of the problem invariant. The full expression contains two
terms referred to as the "ground state" and "excited states"
contributions. The BP equations are interpreted as a special
(BP) gauge fixing condition that emerges as a special
orthogonality constraint between the ground and the excited states.
Stated differently, it selects the generalized loop contributions
as the only ones that survive among the excited states. In
Section II we demonstrate how the invariant interpretation of
the Loop Calculus, introduced in Section I, allows a natural
extension to the case of a general q-ary alphabet. This is
achieved via a sequential loop tower construction. The ground
level in the tower is exactly equivalent to assigning one color
(out of q available) to the "ground state" and considering all
"excited" states to be colored in the remaining $(q-1)$ colors,
according to the loop calculus rule. Sequentially, the second
level in the tower corresponds to selecting a loop from the
previous step, colored in $(q-1)$ colors, and repeating the same
ground vs excited states partitioning procedure into one and
the remaining $(q-2)$ colors, respectively. The construction
proceeds until the complete set of $(q-1)$ levels in the
loop tower (including the corresponding contributions to the
partition function) is established. In Section III we discuss the
relation between the loop calculus and the Bethe free
energy variational approach of [3].

We start by defining a statistical inference problem using
the so-called Forney-style graphical model formulation [4],
[5]. The basic graph, $C_0 = (V_0, E_0)$, is described in terms
of vertices, $V_0 = \{a\}$, and edges, $E_0 = \{(ab)\}$. Variables,
associated with the edges, assume their values in a q-ary
alphabet, $\sigma_{ab} = \sigma_{ba} = 0, \cdots, (q-1)$. The probability of
a given configuration of variables $\sigma = \{\sigma_{ab} | (ab) \in E_0\}$ on
the entire graph is described by

$$ p(\sigma) = Z_{C_0}^{-1} \prod_a f_a(\sigma_a), \qquad Z_{C_0} = \sum_{\sigma} \prod_a f_a(\sigma_a), \tag{1} $$

where $Z_{C_0}$ is the normalization coefficient, also known as the
partition function; $f_a(\sigma_a)$ is an arbitrary positive function of
the variables, $\sigma_a = \{\sigma_{ab} | b \in a, (ab) \in E_0\}$, associated with all edges
attached to vertex $a$. The notation $b \in a$ (or conversely $a \in b$) indicates
that the vertices $b$ and $a$ share an actual edge of the graph,
$(ab) \in E_0$. The marginal probabilities, e.g. those associated with
edges and vertices,

$$ p_a(\sigma_a) = \sum_{\sigma \setminus \sigma_a} p(\sigma), \qquad p_{ab}(\sigma_{ab}) = \sum_{\sigma \setminus \sigma_{ab}} p(\sigma), \tag{2} $$

constitute what one normally needs to evaluate in order to
solve a statistical inference problem. The marginal
probabilities can also be expressed in terms of derivatives of the
so-called equilibrium free energy, $\mathcal{F}_{C_0} = -\ln Z_{C_0}$, with respect
to relevant parameters of the factor functions.
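To make the definitions of Eqs. (1) and (2) concrete, the following minimal sketch evaluates the partition function and an edge marginal by brute-force enumeration. The triangle graph, the random factor tables and all helper names (`nbrs`, `weight`, `edge_marginal`) are illustrative assumptions, not part of the original text.

```python
import itertools
import numpy as np

# Hypothetical toy model: a triangle with one binary variable per edge.
edges = [(0, 1), (0, 2), (1, 2)]
q = 2
rng = np.random.default_rng(0)

# nbrs[a] lists the neighbours of a; axis i of f[a] is the edge to nbrs[a][i].
nbrs = {a: [b for e in edges for b in e if a in e and b != a] for a in range(3)}
# Arbitrary positive factor functions f_a(sigma_a).
f = {a: rng.uniform(0.5, 1.5, size=(q,) * len(nbrs[a])) for a in nbrs}

def weight(sigma):
    """Unnormalised weight prod_a f_a(sigma_a) of one edge configuration."""
    val = dict(zip(edges, sigma))
    w = 1.0
    for a, nb in nbrs.items():
        w *= f[a][tuple(val[tuple(sorted((a, b)))] for b in nb)]
    return w

configs = list(itertools.product(range(q), repeat=len(edges)))
Z = sum(weight(s) for s in configs)            # partition function, Eq. (1)
p = {s: weight(s) / Z for s in configs}        # p(sigma), Eq. (1)

def edge_marginal(e):
    """p_ab(sigma_ab) of Eq. (2): sum p(sigma) over all other edge variables."""
    k = edges.index(e)
    marg = np.zeros(q)
    for s, ps in p.items():
        marg[s[k]] += ps
    return marg

free_energy = -np.log(Z)                       # equilibrium free energy
print(Z, free_energy, edge_marginal((0, 1)))
```

The enumeration costs $q^{|E_0|}$ operations, which is exactly why approximate schemes such as BP are of interest for larger graphs.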

I. Gauge-Invariant Formulation of Loop Calculus 

Formally, loop calculus suggests an explicit decomposition
of the partition function $Z_{C_0}$ in terms of a sum over certain
loops on the graph $C_0$. Below we re-derive the loop calculus
in more general terms compared to [1], [2].

We start with the observation that the partition function,
$Z_{C_0}$, is invariant with respect to a group of linear gauge
transformations of the factor functions

$$ f_a(\sigma_a = (\sigma_{ab}, \cdots)) \to \sum_{\sigma'_{ab}} G_{ab}(\sigma_{ab}, \sigma'_{ab}) f_a(\sigma'_{ab}, \cdots), \tag{3} $$

described by $G = \{G_{ab}(\sigma_{ab}, \sigma'_{ab}); (ab) \in E_0\}$, provided the
pairs of conjugated matrices $G_{ab}$ and $G_{ba}$ are related to each
other by the special constraint

$$ \sum_{\sigma_{ab}} G_{ab}(\sigma_{ab}, \sigma') G_{ba}(\sigma_{ab}, \sigma'') = \delta(\sigma', \sigma''), \tag{4} $$

where $\delta(x, y)$ is 1 if $x = y$ and 0 otherwise. Except as
prescribed by Eq. (4), the gauges are chosen independently at
different edges of the graph. This local freedom in selecting
$G$ is the key to our further analysis of the partition function,
$Z_{C_0}$, now expressed as

$$ Z_{C_0} = \sum_{\sigma} \prod_a \Big( \sum_{\sigma'_a} f_a(\sigma'_a) \prod_{b \in a} G_{ab}(\sigma_{ab}, \sigma'_{ab}) \Big) = \sum_{\sigma} p\{G|\sigma\} = \mathrm{Tr}\{ p\{G|\sigma\} \}, \tag{5} $$

where $\sigma_{ab} = \sigma_{ba}$. We will refer to summation over all allowed
configurations of $\sigma$ in Eq. (5) as computing a graphic trace: a
conventional trace can be considered as a special case of the
graphic trace for a graph that consists of a single vertex and a
single edge. Our next step in the evaluation of Eq. (5) is fixing the
gauges, which means imposing constraints on $G$ in addition
to Eq. (4).
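The gauge invariance stated by Eqs. (3) to (5) can be checked numerically. The sketch below assumes a hypothetical triangle graph with a ternary alphabet and draws one random invertible matrix per edge; the conjugate matrix is then fixed by Eq. (4), which in matrix form reads $G_{ab}^T G_{ba} = 1$. All names are illustrative choices, not taken from the original text.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
q = 3
edges = [(0, 1), (0, 2), (1, 2)]     # hypothetical toy graph (a triangle)
nbrs = {a: [b for e in edges for b in e if a in e and b != a] for a in range(3)}
f = {a: rng.uniform(0.5, 1.5, size=(q,) * len(nbrs[a])) for a in nbrs}

def partition_function(factors):
    """Graphic trace: sum over all edge configurations of prod_a factors[a]."""
    Z = 0.0
    for sigma in itertools.product(range(q), repeat=len(edges)):
        val = dict(zip(edges, sigma))
        w = 1.0
        for a, nb in nbrs.items():
            w *= factors[a][tuple(val[tuple(sorted((a, b)))] for b in nb)]
        Z += w
    return Z

# One random invertible matrix per edge; its conjugate follows from Eq. (4):
# sum_s G_ab[s, s1] * G_ba[s, s2] = delta(s1, s2)  <=>  G_ba = inv(G_ab).T
G = {}
for a, b in edges:
    G[(a, b)] = rng.normal(size=(q, q)) + 3.0 * np.eye(q)
    G[(b, a)] = np.linalg.inv(G[(a, b)]).T

def transform(a):
    """Apply the gauge transformation of Eq. (3) on every edge attached to a."""
    fa = f[a]
    for axis, b in enumerate(nbrs[a]):
        # Contract the old index with the second index of G_ab, then move the
        # new index back to the position of the contracted axis.
        fa = np.moveaxis(np.tensordot(G[(a, b)], fa, axes=([1], [axis])), 0, axis)
    return fa

f_gauged = {a: transform(a) for a in nbrs}
Z_plain = partition_function(f)
Z_gauged = partition_function(f_gauged)
print(Z_plain, Z_gauged)
```

Note that the gauged factor functions are in general no longer positive; only their graphic trace is preserved.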

It is convenient to distinguish a special term in the sum/trace
over $\sigma$ in Eq. (5) with all $\sigma_{ab} = 0$. We will refer to this term
as the ground or, alternatively, uncolored state (term), while
all the other terms in the sum, which contain at least one edge
with $\sigma_{ab} > 0$, are called excited (colored) states. Obviously,
for a general gauge choice $G$ all kinds of excited states, e.g.
those with only one edge being excited/colored, provide nonzero
contributions to $Z$. Discussing individual terms in the $\sigma$-sum
in Eq. (5), we call a vertex colored if at least one edge attached
to it is excited/colored.

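A BP-gauge corresponds to a special choice of $G$ for which
any contribution in the $\sigma$-sum in Eq. (5) that
has at least one vertex with only one attached colored edge vanishes.
Stated differently, a BP-gauge prohibits loose excited/colored
edges at any vertex. Formally, it is expressed as the following
set of conditions

$$ \sum_{\sigma'_a} f_a(\sigma'_a)\, G^{(bp)}_{ab}(\sigma_{ab} \neq 0, \sigma'_{ab}) \prod_{c \in a}^{c \neq b} G^{(bp)}_{ac}(0, \sigma'_{ac}) = 0, \tag{6} $$

enforced independently at any vertex of the graph. Combined
with the constraints (4), Eq. (6) can be re-stated in the vector
form depending only on the ground state part of the gauges:

$$ G^{(bp)}_{ba}(0, \sigma_{ab}) = \rho_a^{-1} \sum_{\sigma_a \setminus \sigma_{ab}} f_a(\sigma_a) \prod_{c \in a}^{c \neq b} G^{(bp)}_{ac}(0, \sigma_{ac}), \tag{7} $$

with

$$ \rho_a = \sum_{\sigma_a} f_a(\sigma_a) \prod_{c \in a} G^{(bp)}_{ac}(0, \sigma_{ac}). \tag{8} $$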

We can alternatively derive Eq. (7) for BP gauges using a
variational approach. To that end we introduce a functional

$$ Z_0(\epsilon) \equiv p\{G|0\} = \prod_a \rho_a(\epsilon_a), \tag{9} $$

where $\epsilon_{ab}(\sigma_{ab}) = G_{ab}(0, \sigma_{ab})$, $\epsilon_a = \{\epsilon_{ab} | b \in a, \forall b\}$,
$\epsilon = \{\epsilon_{ab} | (ab) \in E_0\}$, and $\rho_a(\epsilon_a)$ is given by Eq. (8) with $G^{(bp)}_{ac}(0, \sigma_{ac})$
replaced by $\epsilon_{ac}(\sigma_{ac})$. The conditions for the stationary points of
$\mathcal{F}_0 = -\ln Z_0$ with respect to $\epsilon$, under the additional condition
$\sum_{\sigma} G_{ab}(0, \sigma) G_{ba}(0, \sigma) = 1$, recover Eqs. (7), (8). Note that the
functional $\mathcal{F}_0(\epsilon)$, as well as the BP equations (7), (8), possess
some remaining irrelevant gauge freedom with respect to the
set of transformations $\epsilon_{ab} \to \kappa_{ab} \epsilon_{ab}$ with $\kappa_{ab} \kappa_{ba} = 1$. Stated
differently, the BP equations fix only the relevant part of the
gauge freedom. A connection between the functional $\mathcal{F}_0(\epsilon)$
and the variational Bethe free energy will be established in
Section III.

Fig. 1. Example of a factor graph, $C_0$, with fourteen possible generalized
loops, $\Omega(C_0) = \{C_1\}$, shown in bold on the bottom.

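The conventional form of the BP equations, in terms of the
"messages" $\eta_{ab}(\sigma_{ab})$,

$$ \frac{\exp\big(\eta^{(bp)}_{ab}(\sigma_{ab}) + \eta^{(bp)}_{ba}(\sigma_{ab})\big)}{\sum_{\sigma_{ab}} \exp\big(\eta^{(bp)}_{ab}(\sigma_{ab}) + \eta^{(bp)}_{ba}(\sigma_{ab})\big)} = \frac{\sum_{\sigma_a \setminus \sigma_{ab}} f_a(\sigma_a) \exp\big(\sum_{c \in a} \eta^{(bp)}_{ac}(\sigma_{ac})\big)}{\sum_{\sigma_a} f_a(\sigma_a) \exp\big(\sum_{c \in a} \eta^{(bp)}_{ac}(\sigma_{ac})\big)}, \tag{10} $$

is recovered using the following parametrization

$$ \epsilon_{ab}(\sigma) = G_{ab}(0, \sigma) = \frac{\exp(\eta_{ab}(\sigma))}{\sqrt{\sum_{\sigma} \exp\big(\eta_{ab}(\sigma) + \eta_{ba}(\sigma)\big)}}. \tag{11} $$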
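The step from Eqs. (7), (8) to Eq. (10) can be spelled out in a few lines. The following is a sketch based only on the formulas above; the shorthand $N_{ab}$ for the normalization appearing in Eq. (11) is introduced here for convenience and is not part of the original notation.

```latex
% N_{ab}: shorthand (an assumption of this sketch) for the normalization in Eq. (11)
N_{ab} = \sum_{\sigma} \exp\big(\eta_{ab}(\sigma) + \eta_{ba}(\sigma)\big) = N_{ba},
\qquad
\epsilon_{ab}(\sigma)\,\epsilon_{ba}(\sigma)
  = \frac{\exp\big(\eta_{ab}(\sigma) + \eta_{ba}(\sigma)\big)}{N_{ab}} .

% Multiply Eq. (7) by \epsilon_{ab}(\sigma_{ab}) and insert \rho_a from Eq. (8):
\epsilon_{ab}(\sigma_{ab})\,\epsilon_{ba}(\sigma_{ab})
  = \frac{\sum_{\sigma_a \setminus \sigma_{ab}} f_a(\sigma_a) \prod_{c \in a} \epsilon_{ac}(\sigma_{ac})}
         {\sum_{\sigma_a} f_a(\sigma_a) \prod_{c \in a} \epsilon_{ac}(\sigma_{ac})} .

% With \epsilon_{ac} \propto \exp(\eta_{ac}), the constant factors N_{ac}^{-1/2}
% cancel between numerator and denominator, which leaves exactly Eq. (10).
```

Summing the first identity over $\sigma$ also shows that the parametrization (11) automatically satisfies the normalization condition $\sum_{\sigma} G_{ab}(0, \sigma) G_{ba}(0, \sigma) = 1$.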

Our discussion so far applies to the case of a
general q-ary alphabet. We now turn to the simplest binary
case, $q = 2$, where the ground state parametrization (11)
unambiguously fixes the excited states:
$G_{ab}(1, \sigma) = (1 - 2\sigma) G_{ba}(0, (\sigma - 1)^2)$. Substituting the latter expression and
Eqs. (6), (7) into Eq. (5) we arrive at the main formula of the
loop calculus for the binary alphabet

$$ Z_{C_0} = Z_{0;C_0} \Big( 1 + \sum_{C_1} r(C_1) \Big), \qquad r(C_1) = Z_{0;C_0}^{-1}\, p\{G|\sigma_{C_1}\}, $$
$$ Z_{0;C_0} = p\{G|\sigma_0\}, \qquad \sigma_0 = \{\sigma_{ab} = 0 \,|\, (ab) \in C_0\}, \tag{12} $$
$$ \sigma_{C_1} = \{\sigma_{ab} = 1 \,|\, (ab) \in C_1; \ \sigma_{ab} = 0 \,|\, (ab) \in C_0 \setminus C_1\}, $$

where $\{C_1\} = \Omega(C_0)$ is the set of generalized loops on the
graph, defined as subgraphs of $C_0$ without loose ends, i.e.
with degree of connectivity at any vertex (within the subgraph)
being two or larger.
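The binary loop series of Eq. (12) lends itself to a direct numerical check. The sketch below is an illustration under stated assumptions: the test graph is the complete graph on four vertices (chosen here for convenience, not claimed to be the graph of Fig. 1), the factor tables are random and close to uniform so that plain BP iterations converge, and all function names are hypothetical.

```python
import itertools
import numpy as np

rng = np.random.default_rng(2)
n = 4
edges = list(itertools.combinations(range(n), 2))    # assumed test graph: K4
nbrs = {a: [b for e in edges for b in e if a in e and b != a] for a in range(n)}
f = {a: rng.uniform(0.8, 1.2, size=(2,) * len(nbrs[a])) for a in nbrs}

def contract(fa, vecs, skip=None):
    """Contract axis i of fa with vecs[i] for every i except `skip`."""
    out = fa
    for i in reversed(range(fa.ndim)):
        if i != skip:
            out = np.tensordot(out, vecs[i], axes=([i], [0]))
    return out

# Plain BP on the edge variables: m[(a, b)] is the message sent from a to b.
m = {(a, b): np.full(2, 0.5) for a in nbrs for b in nbrs[a]}
for _ in range(200):
    new = {}
    for a in nbrs:
        incoming = [m[(c, a)] for c in nbrs[a]]
        for i, b in enumerate(nbrs[a]):
            out = contract(f[a], incoming, skip=i)
            new[(a, b)] = out / out.sum()
    m = new

# BP gauge: row 0 is G_ab(0, .) = eps_ab, normalised as in Eq. (11);
# row 1 is the binary excited state G_ab(1, s) = (1 - 2 s) * eps_ba(1 - s).
G = {}
for (a, b) in m:
    norm = np.sqrt(m[(a, b)] @ m[(b, a)])
    eps_ab, eps_ba = m[(b, a)] / norm, m[(a, b)] / norm
    G[(a, b)] = np.array([eps_ab, [eps_ba[1], -eps_ba[0]]])

def term(sigma):
    """p{G|sigma} of Eq. (5) for one coloring sigma of the edges."""
    val = dict(zip(edges, sigma))
    p = 1.0
    for a, nb in nbrs.items():
        rows = [G[(a, b)][val[tuple(sorted((a, b)))]] for b in nb]
        p *= float(contract(f[a], rows))
    return p

def has_loose_end(sigma):
    """True if some vertex has exactly one colored edge attached."""
    deg = np.zeros(n, dtype=int)
    for (a, b), s in zip(edges, sigma):
        deg[a] += s
        deg[b] += s
    return bool(np.any(deg == 1))

colorings = list(itertools.product((0, 1), repeat=len(edges)))
loops = [s for s in colorings if any(s) and not has_loose_end(s)]
Z0 = term((0,) * len(edges))
Z_series = Z0 * (1.0 + sum(term(s) / Z0 for s in loops))          # Eq. (12)
max_loose = max(abs(term(s)) for s in colorings if has_loose_end(s))
Z_exact = sum(
    np.prod([f[a][tuple(s[edges.index(tuple(sorted((a, b))))] for b in nbrs[a])]
             for a in nbrs])
    for s in colorings)
print(len(loops), Z_exact, Z_series, max_loose)
```

In this sketch the terms with loose ends vanish only up to the accuracy of the BP fixed point, so the agreement between `Z_series` and `Z_exact` is a check on both the gauge construction and the convergence of the iterations.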

Beliefs are defined here as substitutes for the exact marginal
probabilities (2), truncated at the first, ground state, term:

$$ b^{(bp)}_{ab}(\sigma_{ab}) = G^{(bp)}_{ab}(0, \sigma_{ab})\, G^{(bp)}_{ba}(0, \sigma_{ab}), \tag{13} $$

$$ b^{(bp)}_a(\sigma_a) = \frac{f_a(\sigma_a) \prod_{b \in a} G^{(bp)}_{ab}(0, \sigma_{ab})}{\sum_{\sigma_a} f_a(\sigma_a) \prod_{b \in a} G^{(bp)}_{ab}(0, \sigma_{ab})}. \tag{14} $$

Then a single generalized loop contribution, $r(C_1)$, is expressed
in terms of the ground state beliefs in the following simple
way

The loop calculus construction for a simple example is
illustrated schematically in Fig. 1.

II. Loop Tower for q-ary alphabet 

Turning to the general q-ary alphabet case, we first notice
that all considerations and formulas of the introduction and
the first part of Section I, all the way up to Eq. (11), actually
apply to the general q-ary case. Partitioning the sum/trace
over $\sigma$ in Eq. (5) into the ground-state term, with $\sigma_0$, and
the remaining excited-state terms, $\{\sigma \setminus \sigma_0\}$, and the emergence
of the self-consistent set of equations for the ground state
gauges (7), (8) are important general features of the gauge fixing
construction. Of course, all the preceding formulas should be
understood in terms of the edge variables that assume values
from $\{0, \cdots, q-1\}$. Generalization of Eq. (12) to a general
q-ary alphabet reads

$$ Z_{C_0} = Z_{0;C_0} + \sum_{C_1 \in \Omega(C_0)} Z_{C_1}, \qquad Z_{C_1} = \sum_{\sigma_{C_1}} p\{G^{(bp)}|\sigma_{C_1}\}. \tag{15} $$

The additional summation over the colored/excited $\sigma_{C_1}$ in
Eq. (15) is a consequence of the fact that for $q > 2$, $\sigma_{C_1}$ is not
fixed unambiguously, but rather represents summation over the
reduced set of $(q-1)$ colors, $1, \cdots, q-1$. The BP-gauges
for the original graphical model are described by Eqs. (4), (6).
The set of excited states gets larger in the q-ary case and,
consequently, there is considerable freedom in selecting the orthogonal
basis set of excited gauges. Selecting one such solution of
Eqs. (4), (6), $\{G^{(bp)}_{ab;0}(\sigma_{ab}, \sigma'_{ab}); (ab) \in C_0\}$, and substituting it
in Eq. (15), we find that $Z_{C_1}$ becomes the partition function
of a reduced graphical model, defined on a subset $C_1 \subset C_0$
of $C_0$,

$$ Z_{C_1} = \sum_{\sigma_{C_1}} \prod_{a \in C_1} f_{1;a}(\sigma_{a;C_1}), \tag{16} $$
$$ f_{1;a}(\sigma_{a;C_1}) = \sum_{\sigma'_a} f_a(\sigma'_a) \prod_{b \in a, C_1} G^{(bp)}_{ab;0}(\sigma_{ab}, \sigma'_{ab}) \prod_{b \in a, C_0 \setminus C_1} G^{(bp)}_{ab;0}(0, \sigma'_{ab}). $$



Fig. 2. Example of a loop tower construction for three colors ($q = 3$),
labeled 0, 1, 2 and shown in the Figure in black, red and blue respectively.
The first layer of the tower is bounded by the red dashed box, with the
original graph, $C_0$, shown in black and three generalized loops,
$\{C_1\} = \Omega(C_0)$, shown in red. On the second layer of the tower each
graph from $\Omega(C_0)$ generates its own set of generalized loops. The next
layer of generalized loops, shown in blue, is bounded by three dashed blue
boxes, with the red graph in each box showing the respective element of $\{C_1\}$.



Here $\sigma_{a;C_1}$ is the vector constructed of $\sigma_{ab}$ with $b \in C_1$,
with the components labeled by $\{1, \cdots, q-1\}$. $Z_{C_1}$ may
be understood as the partition function of a reduced graphical
model, defined on the graph $C_1$ in terms of a reduced (one
element shorter) alphabet and with the factor functions $f_{1;a}$.

This reformulation of the partition function of the original
problem in terms of a sum of partition functions over reduced
graphical problems can be repeated sequentially:
$Z_{C_0} \to Z_{C_1} \to Z_{C_2} \to \cdots \to Z_{C_{q-1}}$, where $C_0 \supset C_1 \supset \cdots \supset C_{q-2}$
is the tower of loops and $Z_{C_j}$ is the partition function of the
graphical model defined in terms of $(q-j)$-ary variables on
the graph $C_j$, which is a generalized loop of $C_{j-1}$. Altogether
one arrives at Eq. (15) supplemented by the sequence

$$ Z_{C_j} = Z_{0;C_j} + \sum_{C_{j+1} \in \Omega(C_j)} Z_{C_{j+1}}. \tag{17} $$

Generalization of Eq. (16) becomes

$$ Z_{C_j} = \sum_{\sigma_{C_j}} \prod_{a \in C_j} f_{j;a}(\sigma_{a;C_j}), \tag{18} $$
$$ f_{j;a}(\sigma_{a;C_j}) = \sum_{\sigma'_{a;C_{j-1}}} f_{j-1;a}(\sigma'_{a;C_{j-1}}) \prod_{b \in a, C_j} G^{(bp)}_{ab;j-1}(\sigma_{ab}, \sigma'_{ab}) \prod_{b \in a, C_{j-1} \setminus C_j} G^{(bp)}_{ab;j-1}(j-1, \sigma'_{ab}), $$

where $\sigma_{C_j}$ is a vector constructed out of the variables defined
on all edges of the graph $C_j$, with the components labeled
by $\{j, \cdots, q-1\}$. The BP gauges in Eq. (18), $G^{(bp)}_{ab;j}$, are
solutions of Eqs. (4), (6) with the original factor functions, $f = f_0$,
replaced by $f_j$.
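The combinatorial skeleton of the loop tower can be enumerated directly. The sketch below lists the nested chains $(C_1, C_2, \cdots)$ with $C_{j+1} \in \Omega(C_j)$ that label the terms of the tower; it only counts subgraphs and does not evaluate any gauges or partition functions. The complete graph on four vertices and the function names are assumptions made for illustration, and a graph is treated here as a generalized loop of itself whenever it has no loose ends.

```python
import itertools

def generalized_loops(edge_set):
    """All non-empty subsets of edge_set in which no vertex has degree one."""
    edge_list = sorted(edge_set)
    loops = []
    for k in range(1, len(edge_list) + 1):
        for subset in itertools.combinations(edge_list, k):
            degree = {}
            for a, b in subset:
                degree[a] = degree.get(a, 0) + 1
                degree[b] = degree.get(b, 0) + 1
            if all(d >= 2 for d in degree.values()):
                loops.append(frozenset(subset))
    return loops

def tower(edge_set, q):
    """Nested chains (C_1, ..., C_k), k <= q - 1, with C_{j+1} in Omega(C_j)."""
    if q <= 1:
        return []          # a one-letter alphabet has no excited states
    chains = []
    for loop in generalized_loops(edge_set):
        chains.append((loop,))
        chains.extend((loop,) + rest for rest in tower(loop, q - 1))
    return chains

C0 = frozenset(itertools.combinations(range(4), 2))   # assumed test graph: K4
binary_terms = tower(C0, 2)     # one chain per generalized loop of C0
ternary_terms = tower(C0, 3)    # chains of length one and two
print(len(binary_terms), len(ternary_terms))
```

The rapid growth of the number of chains with $q$ illustrates why a truncation of the tower at a low level is of practical interest.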



III. Relation to the Bethe free energy approach 

It is known that the exact (equilibrium) free energy of any
classical statistical model can be obtained from a variational
principle based on an exact non-equilibrium variational
functional of the full belief, $b(\sigma)$,

$$ \mathcal{F}_{exact}\{b(\sigma)\} = \sum_{\sigma} b(\sigma) \ln \frac{b(\sigma)}{\prod_a f_a(\sigma_a)}. \tag{19} $$

The only stationary point of the functional under the
normalization condition

$$ \sum_{\sigma} b(\sigma) = 1 \tag{20} $$

reproduces the probability distribution $b(\sigma) = p(\sigma)$, where
$p(\sigma)$ is defined by Eq. (1). This stationary point is actually a
minimum. The value of the exact variational functional at its
minimum is equal to the exact free energy

$$ \mathcal{F}_{exact}\{p(\sigma)\} = \mathcal{F} = -\ln Z, \tag{21} $$

where $Z = Z_{C_0}$, the latter defined above by Eq. (1). Hereafter
we skip the graph index, $C_0$, to simplify notation.
Introducing an approximate variational ansatz

$$ b(\sigma) \approx \frac{\prod_a b_a(\sigma_a)}{\prod_{(ab)} b_{ab}(\sigma_{ab})}, \tag{22} $$

where $b_a$ and $b_{ab}$ are approximations for the corresponding
(exact) marginal probabilities, we substitute it (in the spirit of
[3]) into Eq. (19). We further invoke another approximation
(both approximations are actually exact in the case of a tree,
i.e. a graph with no loops)

$$ b_a(\sigma_a) \approx \sum_{\sigma \setminus \sigma_a} b(\sigma), \qquad b_{ab}(\sigma_{ab}) \approx \sum_{\sigma \setminus \sigma_{ab}} b(\sigma). \tag{23} $$

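This results in the so-called Bethe (approximate) free energy
functional of beliefs $b_a(\sigma_a)$, $b_{ac}(\sigma_{ac})$:

$$ \mathcal{F}_{Bethe} = \sum_a \sum_{\sigma_a} b_a(\sigma_a) \ln \frac{b_a(\sigma_a)}{f_a(\sigma_a)} - \sum_{(ab)} \sum_{\sigma_{ab}} b_{ab}(\sigma_{ab}) \ln b_{ab}(\sigma_{ab}). \tag{24} $$

We require the beliefs to obey the positivity, normalizability
and compatibility constraints, features borrowed from the
corresponding exact probabilities given by Eqs. (2). Thus, we
have $\forall a, c$; $c \in a$ (and inversely $a \in c$):

$$ 0 \le b_a(\sigma_a),\ b_{ac}(\sigma_{ac}) \le 1, \tag{25} $$

$$ \sum_{\sigma_a} b_a(\sigma_a) = 1, \qquad \sum_{\sigma_{ab}} b_{ab}(\sigma_{ab}) = 1, \tag{26} $$

$$ b_{ac}(\sigma_{ac}) = \sum_{\sigma_a \setminus \sigma_{ac}} b_a(\sigma_a), \qquad b_{ac}(\sigma_{ca}) = \sum_{\sigma_c \setminus \sigma_{ca}} b_c(\sigma_c). \tag{27} $$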
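The statement that the approximations (22), (23) are exact on a tree can be illustrated numerically: with the exact marginals of Eq. (2) used as beliefs, the Bethe functional of Eq. (24) should reproduce $-\ln Z$. The sketch below assumes a small hypothetical tree with a ternary alphabet and random positive factors; the names are illustrative only.

```python
import itertools
import numpy as np

rng = np.random.default_rng(3)
q = 3
edges = [(0, 1), (1, 2), (2, 3), (2, 4)]      # hypothetical tree, no loops
nbrs = {a: [b for e in edges for b in e if a in e and b != a] for a in range(5)}
f = {a: rng.uniform(0.5, 1.5, size=(q,) * len(nbrs[a])) for a in nbrs}

def local(a, sigma):
    """Index sigma_a: the values on the edges attached to vertex a."""
    val = dict(zip(edges, sigma))
    return tuple(val[tuple(sorted((a, b)))] for b in nbrs[a])

configs = list(itertools.product(range(q), repeat=len(edges)))
w = np.array([np.prod([f[a][local(a, s)] for a in nbrs]) for s in configs])
Z = w.sum()
p = w / Z

# Exact marginals of Eq. (2), used here as the beliefs b_a and b_ab.
b_vertex = {a: np.zeros_like(f[a]) for a in nbrs}
b_edge = {e: np.zeros(q) for e in edges}
for s, ps in zip(configs, p):
    for a in nbrs:
        b_vertex[a][local(a, s)] += ps
    for k, e in enumerate(edges):
        b_edge[e][s[k]] += ps

# Bethe free energy of Eq. (24) evaluated at these beliefs.
F_bethe = sum(np.sum(b_vertex[a] * np.log(b_vertex[a] / f[a])) for a in nbrs)
F_bethe -= sum(np.sum(b_edge[e] * np.log(b_edge[e])) for e in edges)
F_exact = -np.log(Z)
print(F_bethe, F_exact)
```

On a graph with loops the same computation generally gives two different numbers, which is the gap that the loop series accounts for.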



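To establish a connection between the Bethe free energy
and the functional $\mathcal{F}_0$ we introduce the effective Lagrangian

$$ \begin{aligned} \mathcal{L}_{Bethe} = {} & \sum_a \sum_{\sigma_a} b_a(\sigma_a) \ln \Big( \frac{b_a(\sigma_a)}{f_a(\sigma_a)} \Big) - \sum_{(ab)} \sum_{\sigma_{ab}} b_{ab}(\sigma_{ab}) \ln b_{ab}(\sigma_{ab}) \\ & + \sum_{(ab)} \bigg[ \sum_{\sigma_{ab}} \ln(\varepsilon_{ab}(\sigma_{ab})) \Big( b_{ab}(\sigma_{ab}) - \sum_{\sigma_a \setminus \sigma_{ab}} b_a(\sigma_a) \Big) \\ & \qquad + \sum_{\sigma_{ba}} \ln(\varepsilon_{ba}(\sigma_{ba})) \Big( b_{ab}(\sigma_{ba}) - \sum_{\sigma_b \setminus \sigma_{ba}} b_b(\sigma_b) \Big) \bigg], \end{aligned} \tag{28} $$

that depends on all beliefs that satisfy the normalization
constraints (26), with no constraints on $\varepsilon_{ab}(\sigma_{ab})$. Requiring
the variation with respect to $\varepsilon_{ab}$ to vanish obviously leads
to the constraints given by Eq. (27), and once all constraints
are fulfilled the functional does not depend on $\varepsilon_{ab}$ (which
should be considered as a gauge symmetry) and coincides with
$\mathcal{F}_{Bethe}$ as a function of the beliefs. This implies a one-to-one
correspondence between the extrema of $\mathcal{L}_{Bethe}$ and of the Bethe free
energy $\mathcal{F}_{Bethe}$.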

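Finding extrema of $\mathcal{L}_{Bethe}$ with respect to the beliefs
(this can be technically achieved by introducing Lagrange
multipliers for the set of constraints (26)) leads to beliefs that
depend explicitly on $\varepsilon = \{\varepsilon_{ab} | (ab) \in E_0\}$:

$$ b^{(\varepsilon)}_a(\sigma_a) = (Q_a(\varepsilon_a))^{-1} f_a(\sigma_a) \prod_{b \in a} \varepsilon_{ab}(\sigma_{ab}), \tag{29} $$

$$ b^{(\varepsilon)}_{ab}(\sigma_{ab}) = Q_{ab}^{-1}(\varepsilon_{ab}, \varepsilon_{ba})\, \varepsilon_{ab}(\sigma_{ab}) \varepsilon_{ba}(\sigma_{ab}), \tag{30} $$

$$ Q_a(\varepsilon_a) = \sum_{\sigma_a} f_a(\sigma_a) \prod_{c \in a} \varepsilon_{ac}(\sigma_{ac}), \tag{31} $$

$$ Q_{ab}(\varepsilon_{ab}, \varepsilon_{ba}) = \sum_{\sigma_{ab}} \varepsilon_{ab}(\sigma_{ab}) \varepsilon_{ba}(\sigma_{ab}), \tag{32} $$

where $\varepsilon_a = \{\varepsilon_{ab}; a \in b, \forall b\}$. Substituting the values of beliefs
given by Eqs. (29), (30) into Eq. (28) results in a functional that
depends on the $\varepsilon$ variables only

$$ \mathcal{F}_B(\varepsilon) = -\sum_a \ln(Q_a(\varepsilon_a)) + \sum_{(ab)} \ln(Q_{ab}(\varepsilon_{ab}, \varepsilon_{ba})). \tag{33} $$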

The functional $\mathcal{F}_B$ possesses strong gauge symmetry: it is
invariant under the set of transformations $\varepsilon_{ab} \to \kappa_{ab} \varepsilon_{ab}$. The
gauge can be partially fixed by implementing a gauge
(normalization) condition

$$ \sum_{\sigma_{ab}} \varepsilon_{ab}(\sigma_{ab}) \varepsilon_{ba}(\sigma_{ab}) = 1. \tag{34} $$

Implementing this constraint, the second term in Eq. (33)
vanishes. This means that switching from the notations of
Section I to our current notations, $\epsilon \to \varepsilon$ and $\rho \to Q$, we
arrive at $\mathcal{F}_0 = \mathcal{F}_B$. Stated more formally, $\mathcal{F}_0$ introduced earlier
represents the gauge-invariant functional $\mathcal{F}_B$ in a particular
gauge determined by Eq. (34). This implies a one-to-one
correspondence of the extrema of $\mathcal{F}_0$ to the extrema of $\mathcal{L}_{Bethe}$,
and therefore to the extrema of the Bethe free energy $\mathcal{F}_{Bethe}$.
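The gauge symmetry of the functional in Eq. (33), and the vanishing of its second term under the condition (34), can be verified with a few lines of code. The sketch below assumes a hypothetical triangle graph, random positive $\varepsilon$ vectors and independent positive scale factors `kappa` on every directed edge; all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
q = 3
edges = [(0, 1), (0, 2), (1, 2)]      # hypothetical toy graph (a triangle)
nbrs = {a: [b for e in edges for b in e if a in e and b != a] for a in range(3)}
f = {a: rng.uniform(0.5, 1.5, size=(q,) * len(nbrs[a])) for a in nbrs}
directed = [(a, b) for a in nbrs for b in nbrs[a]]

def Q_vertex(a, eps):
    """Q_a of Eq. (31): contract f_a with eps_ac on every attached edge."""
    out = f[a]
    for c in reversed(nbrs[a]):
        out = out @ eps[(a, c)]       # contracts the last remaining axis
    return float(out)

def F_B(eps):
    """The functional of Eq. (33)."""
    vertex_part = -sum(np.log(Q_vertex(a, eps)) for a in nbrs)
    edge_part = sum(np.log(eps[(a, b)] @ eps[(b, a)]) for a, b in edges)
    return vertex_part + edge_part

eps = {d: rng.uniform(0.5, 1.5, size=q) for d in directed}
kappa = {d: rng.uniform(0.5, 2.0) for d in directed}
eps_scaled = {d: kappa[d] * eps[d] for d in directed}

# Gauge fixing as in Eq. (34): rescale so that sum_s eps_ab(s) eps_ba(s) = 1.
eps_fixed = {(a, b): eps[(a, b)] / np.sqrt(eps[(a, b)] @ eps[(b, a)])
             for a, b in directed}
# With the edge term gone, only the vertex term of Eq. (33) remains.
F_0 = -sum(np.log(Q_vertex(a, eps_fixed)) for a in nbrs)
print(F_B(eps), F_B(eps_scaled), F_0)
```

The three printed numbers coincide up to rounding, since the rescaling factors picked up by the vertex terms are compensated exactly by those of the edge terms.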



IV. Discussions and Conclusions 

We first summarize the results presented in the manuscript.
We have introduced a group of gauge transformations that
keep the partition function of the graphical model invariant,
and naturally split the gauges into the "ground" and "excited"
parts. The partition function is decomposed into the principal
ground term and many excited terms. Each excited contribution is
interpreted in terms of an excited subgraph constructed from
excited edges. Requiring that only excited subgraphs with
no loose ends contribute to the partition function sets the
BP equations for the ground gauges. We show that the BP
equations can be derived using a variational principle for the
partition function as a function of the ground gauges. Further
consideration differs for the binary and q-ary alphabets. In
the binary case the excited gauges are fixed unambiguously,
generating the binary loop series over generalized loops for
the partition function [1], [2], [6]. In the q-ary case we pick
one (of many possible) excited gauges and present the
full partition function as a sum over generalized loops. Each
contribution labeled by a generalized loop can be viewed as
a new graphical model defined on this loop with a new set of
factor functions. The loop decomposition procedure is applied
again, introducing new ground and excited gauges, fixing the
gauges, etc. The procedure repeated for $(q-1)$ layers builds
a q-story loop tower. Finally, we showed that the BP-gauges
can be determined using a variational principle and related
the corresponding functional $\mathcal{F}_0$ to the Bethe free energy
functional constructed in the spirit of [3].

These results open new avenues for further development,
and also raise a set of important and challenging questions
listed below. (1) Already the lowest-level BP equations in the
loop tower, those for the ground BP-gauge, may have multiple solutions.
Our construction applied to different solutions will generate
different loop decompositions for the partition function. The
question is whether a preferred solution is in some way better than
the others. Naive intuition suggests that the BP gauge with the
highest value of $Z_{0;C_0}$ would serve better. (2) Furthermore, in
the case of a q-ary alphabet with $q > 2$, positivity of the factor
functions at higher tower levels is not guaranteed. Positivity
would be desirable for interpreting the auxiliary graphical
problems as actual statistical inference problems, with
the factor functions related to probabilities. On the other hand,
there is considerable freedom in selecting the excited gauges, and a
question surfaces: could one select the excited gauges in a way
that guarantees positivity of the higher-level factor functions? (3)
The BP ground state contribution to the partition function,
$Z_{0;C_0}$, is positive by construction; however, the signs of the
excited terms can alternate. This raises a couple of important
questions. How do the signs of the loop terms depend on
the factor functions and the graphical model itself? Based
on our previous experiments [6], we know that the emergence of
an excited loop contribution comparable to the ground state
signals a possible failure of BP as an approximation to
exact inference. How exactly do the sign alternation and
relative value of the tower loop contributions affect the success
or failure of BP as an approximation? (4) The equilibrium
Bethe free energy estimates the value of the partition function;
however, the variational derivation sketched in Section III does
not guarantee that the resulting $\mathcal{F}_0$ is actually larger than
the exact $\mathcal{F}$. Indeed, after the transition from Eqs. (19), (22), (23) to
Eq. (24) we discuss the latter formulation while completely
ignoring the fact that the conditions (23) can be violated for
the resulting BP solutions. How does this violation affect the
relation $\mathcal{F} \le \mathcal{F}_0$, and what are the consequences of this
inequality for the loop series?

We conclude by mentioning some future research
directions. As demonstrated in [6], the loop calculus is suggestive
of an efficient truncation of the full series that can
potentially improve the BP approximation. This idea can also be
extended to the q-ary alphabet case, with the tower truncated
at some relatively low level. This approach can
find application in the decoding of non-binary codes
and also in problems, such as computer vision, that require
multi-valued data reconstruction. The loop tower approach
can also be extended to the analogous case of a continuous
alphabet. In this case the ground state gauges satisfy a set
of integral equations, while the ground and excited states
that define the gauges become elements of
infinite-dimensional functional Hilbert spaces, which makes the tower height
unlimited and turns the tower loop decomposition into an
infinite series. Finally, we note that the gauge conditions may
be chosen in some other, non-BP way. The BP-gauge is of special
importance for dilute, locally tree-like graphs simply because
in the loop-free case the whole loop hierarchy (the entire loop
tower) disappears. One could conjecture that for some other
classes of graphical models, e.g. those naturally defined on
regular lattices, similar cancellations can take place for some
alternative, specially selected gauges.